AITopics | sequence classification

Collaborating Authors

sequence classification

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

metric

Neural Information Processing Systems | Apr-25-2026, 23:44:35 GMT

Dynabench comprises four dynamic tasks with multiple rounds of datasets that will grow over time. Because we must be able to evaluate a wide variety of models, both in the loop and outside of it, we employ a black-box, post hoc approach, i.e., one that can be applied after data collection to existing data, on any uploaded model, without requiring anything other than its predictions. One straightforward way to measure fairness, then, is to apply clearly delimited, heuristic perturbations to existing evaluation datasets and measure whether performance drops. This approach is similar to recent works that use grammars to heuristically generate pairs of examples varying in gender [58] and/or race [67], in that it relies on predefined lists of words. However, because we also want to minimize the effect on our classification labels, we adopted an approach that is more targeted than grammars and also preserves the original input data distribution: we replace each word in the input data that carries a clear signal about race/ethnicity and/or gender identity with a similar word referring to another group, rerun inference, and measure how many labels flipped (i.e., the difference in microaverage accuracy).

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Industry: Transportation (0.37)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.98)
Information Technology > Artificial Intelligence > Natural Language (0.73)
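
The perturbation test described in the abstract above (swap identity words, rerun inference, compare predictions) can be sketched roughly as follows. This is a minimal illustration, not the platform's implementation: the SWAPS word list, the toy_predict stand-in model, and all function names are hypothetical, and a real word list would need to handle ambiguous forms such as "her" (his/him).

```python
import re

# Hypothetical swap list; the abstract only says predefined word lists are used.
SWAPS = {"he": "she", "she": "he", "his": "her", "her": "his",
         "man": "woman", "woman": "man"}

# Split text into alphabetic runs and everything else, so that joining
# the tokens back together reproduces spacing and punctuation exactly.
_TOKEN = re.compile(r"[A-Za-z]+|[^A-Za-z]+")

def perturb(text):
    """Replace each listed word with its counterpart, keeping capitalization."""
    out = []
    for tok in _TOKEN.findall(text):
        repl = SWAPS.get(tok.lower())
        if repl is None:
            out.append(tok)
        elif tok[0].isupper():
            out.append(repl.capitalize())
        else:
            out.append(repl)
    return "".join(out)

def accuracy(predict, texts, labels):
    """Fraction of examples whose prediction equals the gold label."""
    return sum(predict(t) == y for t, y in zip(texts, labels)) / len(texts)

def fairness_report(predict, texts, labels):
    """Black-box check: only the model's predictions are needed."""
    perturbed = [perturb(t) for t in texts]
    flips = sum(predict(a) != predict(b) for a, b in zip(texts, perturbed))
    return {
        "accuracy_original": accuracy(predict, texts, labels),
        "accuracy_perturbed": accuracy(predict, perturbed, labels),
        "flipped": flips,
    }

def toy_predict(text):
    # Deliberately biased stand-in model: predicts 1 whenever "he" appears.
    return int("he" in text.lower().split())
```

Calling fairness_report(toy_predict, texts, labels) on a small labeled list returns the accuracy before and after perturbation together with the number of predictions that changed; a model that ignores the swapped words would report zero flips.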


Efficient Approximation Algorithms for Strings Kernel Based Sequence Classification

Neural Information Processing Systems | Nov-21-2025, 16:08:24 GMT

Sequence classification algorithms, such as SVM, require a definition of a distance (similarity) measure between two sequences. A commonly used notion of similarity is the number of matches between k-mers (k-length subsequences) in the two sequences. Extending this definition, by considering two k-mers to match if their distance is at most m, yields better classification performance. This, however, makes the problem computationally much more complex. Known algorithms to compute this similarity have computational complexity that renders them applicable only for small values of k and m.

efficient approximation algorithm, name change, string kernel, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
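
To make the similarity notion in the abstract above concrete, here is a naive sketch that counts pairs of k-mers, one from each sequence, that are within distance m of each other. It assumes Hamming distance and contiguous k-mers, which the abstract does not spell out, and it is a brute-force baseline rather than the paper's approximation algorithm; the paper's exact kernel definition may differ from this pair count. All function names are illustrative.

```python
from itertools import product

def kmers(seq, k):
    """All contiguous length-k substrings of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def match_count(x, y, k, m=0):
    """Count k-mer pairs (one from x, one from y) at Hamming distance <= m.

    With m = 0 this is the exact-match count; m > 0 allows mismatches.
    Runs in O(len(x) * len(y) * k) time, so it only suits short inputs.
    """
    return sum(hamming(a, b) <= m for a, b in product(kmers(x, k), kmers(y, k)))
```

For example, match_count("ACGT", "ACGA", k=3) counts only the shared k-mer ACG, while allowing one mismatch (m=1) also pairs CGT with CGA. The quadratic pair enumeration here is the kind of cost that motivates faster algorithms for larger k and m.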


Efficient Approximation Algorithms for Strings Kernel Based Sequence Classification

Muhammad Farhan, Juvaria Tariq, Arif Zaman, Mudassir Shabbir, Imdad Ullah Khan

Neural Information Processing Systems | Nov-21-2025, 13:16:38 GMT

This, however, makes the problem computationally much more complex.

bioinformatics, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Asia > Pakistan > Punjab > Lahore Division > Lahore (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New Jersey (0.04)
(2 more...)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Biomedical Informatics (0.94)


XAI-Driven Deep Learning for Protein Sequence Functional Group Classification

Chakraborty, Pratik, Bhargava, Aryan

arXiv.org Artificial Intelligence | Nov-19-2025

Proteins perform essential biological functions, and accurate classification of their sequences is critical for understanding structure-function relationships, enzyme mechanisms, and molecular interactions. This study presents a deep learning-based framework for functional group classification of protein sequences derived from the Protein Data Bank (PDB). Four architectures were implemented: Convolutional Neural Network (CNN), Bidirectional Long Short-Term Memory (BiLSTM), CNN-BiLSTM hybrid, and CNN with Attention. Each model was trained using k-mer integer encoding to capture both local and long-range dependencies. Among these, the CNN achieved the highest validation accuracy of 91.8%, demonstrating the effectiveness of localized motif detection. Explainable AI techniques, including Grad-CAM and Integrated Gradients, were applied to interpret model predictions and identify biologically meaningful sequence motifs. The discovered motifs, enriched in histidine, aspartate, glutamate, and lysine, represent amino acid residues commonly found in catalytic and metal-binding regions of transferase enzymes. These findings highlight that deep learning models can uncover functionally relevant biochemical signatures, bridging the gap between predictive accuracy and biological interpretability in protein sequence analysis.

artificial intelligence, classification, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2511.13791

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)
Education > Health & Safety > School Nutrition (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
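
The abstract above mentions k-mer integer encoding as the model input. One common way to do this is sketched below; the details are assumptions, not taken from the paper: overlapping k-mers with stride 1, ID 0 reserved for padding, ID 1 for k-mers unseen when the vocabulary was built, and truncation or right-padding to a fixed length. The function names are hypothetical.

```python
def build_vocab(sequences, k):
    """Assign an integer ID to every k-mer seen in the training sequences.

    IDs start at 2 because 0 is reserved for padding and 1 for unknown k-mers.
    """
    vocab = {}
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            # setdefault only inserts when the k-mer is new, so IDs stay dense.
            vocab.setdefault(seq[i:i + k], len(vocab) + 2)
    return vocab

def encode(seq, vocab, k, max_len):
    """Map a sequence to a fixed-length list of k-mer IDs."""
    ids = [vocab.get(seq[i:i + k], 1) for i in range(len(seq) - k + 1)]
    ids = ids[:max_len]                      # truncate long sequences
    return ids + [0] * (max_len - len(ids))  # right-pad short ones
```

The resulting integer lists can be fed to an embedding layer in front of a CNN or BiLSTM; the choice of k and max_len would be tuned per dataset.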


Metadata-Aligned 3D MRI Representations for Contrast Understanding and Quality Control

Avci, Mehmet Yigit, Borges, Pedro, Fernandez, Virginia, Wright, Paul, Yigitsoy, Mehmet, Ourselin, Sebastien, Cardoso, Jorge

arXiv.org Artificial Intelligence | Nov-4-2025

Magnetic Resonance Imaging suffers from substantial data heterogeneity and the absence of standardized contrast labels across scanners, protocols, and institutions, which severely limits large-scale automated analysis. A unified representation of MRI contrast would enable a wide range of downstream utilities, from automatic sequence recognition to harmonization and quality control, without relying on manual annotations. To this end, we introduce MR-CLIP, a metadata-guided framework that learns MRI contrast representations by aligning volumetric images with their DICOM acquisition parameters. The resulting embeddings show distinct clusters of MRI sequences and outperform supervised 3D baselines under data scarcity in few-shot sequence classification. Moreover, MR-CLIP enables unsupervised data quality control by identifying corrupted or inconsistent metadata through image-metadata embedding distances. By transforming routinely available acquisition metadata into a supervisory signal, MR-CLIP provides a scalable foundation for label-efficient MRI analysis across diverse clinical datasets.

artificial intelligence, machine learning, metadata, (14 more...)

arXiv.org Artificial Intelligence

2511.00681

Country:

Europe > United Kingdom (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.30)
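
The quality-control idea in the abstract above (flagging scans whose image embedding sits far from the embedding of their own metadata) might look roughly like the sketch below. It assumes embeddings are already computed as plain lists of floats, uses cosine distance, and applies an arbitrary threshold of 0.5; the distance measure, the threshold, and the function names are all illustrative assumptions rather than details from the paper.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def flag_inconsistent(image_embs, meta_embs, threshold=0.5):
    """Return indices of scans whose image and metadata embeddings disagree.

    image_embs[i] and meta_embs[i] must belong to the same scan.
    """
    return [
        i
        for i, (u, v) in enumerate(zip(image_embs, meta_embs))
        if cosine_distance(u, v) > threshold
    ]
```

In practice the threshold would be chosen from the distribution of distances on data known to be clean, and flagged scans would go to manual review rather than being discarded automatically.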


A Neuro-Symbolic Framework for Sequence Classification with Relational and Temporal Knowledge

Lorello, Luca Salvatore, Lippi, Marco, Melacci, Stefano

arXiv.org Artificial Intelligence | May-9-2025

One of the goals of neuro-symbolic artificial intelligence is to exploit background knowledge to improve the performance of learning tasks. However, most of the existing frameworks focus on the simplified scenario where knowledge does not change over time and does not cover the temporal dimension. In this work we consider the much more challenging problem of knowledge-driven sequence classification where different portions of knowledge must be employed at different timesteps, and temporal relations are available. Our experimental evaluation compares multi-stage neuro-symbolic and neural-only architectures, and it is conducted on a newly-introduced benchmarking framework. Results demonstrate the challenging nature of this novel setting, and also highlight under-explored shortcomings of neuro-symbolic methods, representing a precious reference for future research.

logic & formal reasoning, machine learning, test accuracy, (19 more...)

arXiv.org Artificial Intelligence

2505.05106

Country:

North America (0.28)
Europe (0.28)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.92)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.68)


Can Large Language Models Predict Antimicrobial Resistance Gene?

Yoo, Hyunwoo

arXiv.org Artificial Intelligence | Mar-6-2025

This study demonstrates that generative large language models can be utilized in a more flexible manner for DNA sequence analysis and classification tasks compared to traditional transformer encoder-based models. While recent encoder-based models such as DNABERT and Nucleotide Transformer have shown significant performance in DNA sequence classification, transformer decoder-based generative models have not yet been extensively explored in this field. This study evaluates how effectively generative Large Language Models handle DNA sequences with various labels and analyzes performance changes when additional textual information is provided. Experiments were conducted on antimicrobial resistance genes, and the results show that generative Large Language Models can offer comparable or potentially better predictions, demonstrating flexibility and accuracy when incorporating both sequence and textual information. The code and data used in this work are available at the following GitHub repository: https://github.com/biocomgit/llm4dna.

dna sequence, language model, sequence, (14 more...)

arXiv.org Artificial Intelligence

2503.04413

Genre: Research Report > New Finding (0.89)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.36)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Neuromorphic Spiking Neural Network Based Classification of COVID-19 Spike Sequences

Murad, Taslim, Chourasia, Prakash, Ali, Sarwan, Khan, Imdad Ullah, Patterson, Murray

arXiv.org Artificial Intelligence | Dec-19-2024

The volume of available SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) virus data has grown exponentially post-COVID, opening research doors to analyze the virus's behavior. Researchers conduct various studies, such as genomic surveillance, to gain a deeper understanding of the virus so that efficient prevention mechanisms can be developed. However, the unstable nature of the virus (rapid mutations, multiple hosts, etc.) creates challenges in designing analytical systems for it. Therefore, we propose a neural network-based (NN) mechanism to perform an efficient analysis of the SARS-CoV-2 data, as NNs exhibit generalized behavior upon training. Moreover, rather than using the full-length genome of the virus, we apply our method to its spike region, as this region is known to have predominant mutations and is used to attach to the host cell membrane. In this paper, we introduce a pipeline that first converts the spike protein sequences into a fixed-length numerical representation and then uses a Neuromorphic Spiking Neural Network to classify those sequences. We compare the performance of our method with various baselines using real-world SARS-CoV-2 spike sequence data and show that our method achieves higher predictive accuracy than recent baselines.

artificial intelligence, machine learning, sequence, (17 more...)

arXiv.org Artificial Intelligence

2501.14746

Country: Asia > Pakistan > Punjab (0.14)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)


EPIC: Enhancing Privacy through Iterative Collaboration

Chourasia, Prakash, Lonkar, Heramb, Ali, Sarwan, Patterson, Murray

arXiv.org Artificial Intelligence | Nov-7-2024

Advancements in genomics technology lead to a rising volume of viral (e.g., SARS-CoV-2) sequence data, resulting in increased usage of machine learning (ML) in bioinformatics. Traditional ML techniques require centralized data collection and processing, posing challenges in realistic healthcare scenarios. Additionally, privacy, ownership, and stringent regulation issues exist when pooling medical data into centralized storage to train a powerful deep learning (DL) model. The federated learning (FL) approach overcomes such issues by setting up a central aggregator server and a shared global model. It also facilitates data privacy by extracting knowledge while keeping the actual data private. This work proposes a cutting-edge Privacy enhancement through Iterative Collaboration (EPIC) architecture. The network is divided and distributed between local and centralized servers. We demonstrate the EPIC approach on a supervised classification problem: estimating the lineage of SARS-CoV-2 genomic sequence data without explicitly transferring raw sequence data. We aim to create a universal decentralized optimization framework that allows various data holders to work together and converge to a single predictive model. The findings demonstrate that privacy-preserving strategies can be successfully used with aggregation approaches without materially altering the degree of learning convergence. Finally, we highlight a few potential issues and prospects for study in FL-based approaches to healthcare applications.

artificial intelligence, bioinformatics, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2411.05167

Country:

North America > United States > California (0.14)
Europe > United Kingdom > Scotland (0.05)
Europe > Sweden (0.05)
(10 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
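
The abstract above refers to federated learning with a central aggregator and a shared global model. As background, the aggregation step in the widely used federated averaging scheme can be sketched as below. This is generic federated averaging, not the EPIC architecture itself, whose split between local and centralized servers is not detailed in the abstract; the flat-list parameter representation and the function name are assumptions made for illustration.

```python
def federated_average(client_weights, client_sizes):
    """Combine client parameter vectors into one global vector.

    client_weights: one flat list of floats per client, all the same length.
    client_sizes:   number of local training examples per client, used so
                    that clients with more data contribute proportionally more.
    Only parameters are exchanged; raw sequence data never leaves a client.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(n_params)
    ]
```

In a full training loop the server would broadcast the averaged parameters back to the clients, each client would train locally for a few steps, and the cycle would repeat until the global model converges.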

